Low-Rank Doubly Stochastic Matrix Decomposition for Cluster Analysis

نویسندگان

  • Zhirong Yang
  • Jukka Corander
  • Erkki Oja
چکیده

Cluster analysis by nonnegative low-rank approximations has experienced a remarkable progress in the past decade. However, the majority of such approximation approaches are still restricted to nonnegative matrix factorization (NMF) and suffer from the following two drawbacks: 1) they are unable to produce balanced partitions for large-scale manifold data which are common in real-world clustering tasks; 2) most existing NMF-type clustering methods cannot automatically determine the number of clusters. We propose a new low-rank learning method to address these two problems, which is beyond matrix factorization. Our method approximately decomposes a sparse input similarity in a normalized way and its objective can be used to learn both cluster assignments and the number of clusters. For efficient optimization, we use a relaxed formulation based on Data-Cluster-Data random walk, which is also shown to be equivalent to low-rank factorization of the doublystochastically normalized cluster incidence matrix. The probabilistic cluster assignments can thus be learned with a multiplicative majorization-minimization algorithm. Experimental results show that the new method is more accurate both in terms of clustering large-scale manifold data sets and of selecting the number of clusters.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Improving cluster analysis by co-initializations

Many modern clustering methods employ a non-convex objective function and use iterative optimization algorithms to find local minima. Thus initialization of the algorithms is very important. Conventionally the starting guess of the iterations is randomly chosen; however, such a simple initialization often leads to poor clusterings. Here we propose a new method to improve cluster analysis by com...

متن کامل

Fast and Accurate Low Rank Approximation of Massive Graphs

In this paper we present a fast and accurate procedure called clustered low rank matrix approximation for massive graphs. The procedure involves a fast clustering of the graph and then approximating each cluster separately using existing methods, e.g. the singular value decomposition, or stochastic algorithms. The cluster-wise approximations are then extended to approximate the entire graph. Th...

متن کامل

Clustered low rank approximation of graphs in information science applications

In this paper we present a fast and accurate procedure called clustered low rank matrix approximation for massive graphs. The procedure involves a fast clustering of the graph and then approximates each cluster separately using existing methods, e.g. the singular value decomposition, or stochastic algorithms. The cluster-wise approximations are then extended to approximate the entire graph. Thi...

متن کامل

Some results on the symmetric doubly stochastic inverse eigenvalue problem

‎The symmetric doubly stochastic inverse eigenvalue problem (hereafter SDIEP) is to determine the necessary and sufficient conditions for an $n$-tuple $sigma=(1,lambda_{2},lambda_{3},ldots,lambda_{n})in mathbb{R}^{n}$ with $|lambda_{i}|leq 1,~i=1,2,ldots,n$‎, ‎to be the spectrum of an $ntimes n$ symmetric doubly stochastic matrix $A$‎. ‎If there exists an $ntimes n$ symmetric doubly stochastic ...

متن کامل

Stochastic bounds with a low rank decomposition

We investigate how we can bound a Discrete Time Markov Chain (DTMC) by a stochastic matrix with a low rank decomposition. In the first part of the paper we show the links with previous results for matrices with a decomposition of size 1 or 2. Then we show how the complexity of the analysis for steady-state and transient distributions can be simplified when we take into account the decomposition...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Journal of Machine Learning Research

دوره 17  شماره 

صفحات  -

تاریخ انتشار 2016